The Shunting Yard Algorithm, written by Dijkstra, converts an infix expression into a post-fix expression. For example, the expression:
becomes
Precedence and Associativity
What makes this difficult is that operators, generally, have some notion of precedence in most languages. For example, multiplication typically has a higher precedence than addition. So the expression 3 + 5 * 6 is 33 (multiplication happens first) rather than 40.
There's also the issue of associativity. For example, 3 + 4 + 5 happens left-to-right, or it is left-associative. On the other hand, assignment is right-associative. For example, a = b = c = 4, all values receive the value 4 because c = 4 is performed first, while b is assigned to theresult of the assignment of c, which is 4.
Note: This kata is in flux, I've added the link to the C++ operators but the examples use ^ as "to the power of" rather than bitwise negate. Follow the examples until I update the list of tests.
Examples
Here is a series of examples with expected results:
Example Infix Expression
Expected Postfix Result
Notes
<empty string>
<empty string>
A good place to start, you'll create the basic translator class and get a value back.
<null>
<empty string>
Make sure you handle one error case, a null string, by returning a reasonable default value of an empty string.
45
45
Make sure you can handle a simple, literal number. This test may work out of the box, if not, it should be simple to get it to work.
+
+
Make sure you can handle a simple operator. This test will probably work out of the box after getting the previous test to pass. Later on, this test represents a error case of sorts since there are no operands for this operator.
3 + 8
3 8 +
Your first "real" test, can you perform some basic translation?
2 + 9 - 6
2 9 + 6 -
You probably hard-coded the first example, this makes you do a little more work. And it adds a second operator to boot.
2 + 9 * 6
2 9 6 * +
This is the first example where operator precedence makes a difference.
2 * 10 ^ 6
2 10 6 ^ *
The ^ (raised to the power of) operator is typically a right-associative operator.
2 ^ 3 ^ 4
2 3 4 ^ ^
This was a bit of a dud test I added. It immeidately passed. Might consider leaving this one out.
a ^ 3
a 3 ^
I want my expression evaluator to handle symbols as well as literal numbers. This example forces that behavior.
(3 + 4)
3 4 +
Handle ()'s. First, simply remove them.
(3 + 4) * 5
3 4 + 5 *
Now make sure that ()s change the order of evaluation since + typically happensafter *.
(3+(4-5))*6
3 4 5 - + 6 *
A more complex example to verify that the solution is general. Also notice that the spaces between separate tokens has been removed. You'll need to handle both with and without spaces.
f(3)
3 f
This is an example of a function call. Notice, this forces "look ahead" in your algorithm because you do not know if something is a function or a variable unless you look ahead to the next token.
f(g(4))
4 g f
Is your solution general enough to handle nested functions?
f(3, 4, 19)
3 4 19 f
Can your solution handle multiple parameters to a function call?
I added this some time later. It incorporates just about everything already done and it specifically pushes handling multiple parameters to a function correctly. I expected this to take some time, but it really did not take much (a few minutes).
One Approach
The examples above are taken from a recent (June 2009) personal walk-through of the algorithm. I was surprised by my results and the ease with which I was able to implement the algorithm.
I recommend the following approach when trying this problem:
Use the examples with the expected results shown above
Work top to bottom in the order shown
Addone example at a time and get all tests to pass
Review both your test code and production code, refactoring as you see fit
One note, when I worked on this, I used Java with JUnit 4's parameterized tests. Here's what the final test file looks like:
This worked OK, but occasionally I wanted to run just one test. This does not seem to be supported, so I had to comment out various lines in thedata() method. The next time I work on this, I'll probably use individual test methods.
Getting Started
Here is one possible start to this problem (with the first three tests added and passing): The Test
Note: Occasionally I experiment with different ways of writing unit tests with JUnit. This is one such example where I attempt to simulate the domain specific language used by cucumber.
I take the approach of splitting the expression into individual tokens and then processing them one by one. To do this, I use regular expressions and to format the output, I use String.format(...). Here are a few hints along those lines:
"3 + 4".split("\\s") - slits the string into an array: {"3", "+", "4"}.
"3 + 4".split("\\s+") - same as above, but allows for 1 or more spaces.
String.format("%s %s %s", "3", "4", "+") --> "3 4 +". Consider using String.format to format your output.
When searching for operators using regular expressions, remember to escape them. E.g., "\\+". You can generally escape any character.
Eventually, splitting the expression will be complex enough that it will warrant its own class with its own micro tests.
The operators have difference associativity and precedence. Eventually, you'll probably want to create a class for that.
Eventually you'll probably want a simple factory that can map a string into an operator.
If you want to validate a number, you can use a regular expression or, in my case, the DecimalFormat class.
What We Accomplished
Here's a zip of the source files we created during the dojo:
Background
The Shunting Yard Algorithm, written by Dijkstra, converts an infix expression into a post-fix expression. For example, the expression:Precedence and Associativity
What makes this difficult is that operators, generally, have some notion of precedence in most languages. For example, multiplication typically has a higher precedence than addition. So the expression 3 + 5 * 6 is 33 (multiplication happens first) rather than 40.There's also the issue of associativity. For example, 3 + 4 + 5 happens left-to-right, or it is left-associative. On the other hand, assignment is right-associative. For example, a = b = c = 4, all values receive the value 4 because c = 4 is performed first, while b is assigned to the result of the assignment of c, which is 4.
For the purposes of this exercise, consider these precednece rules from C++.
Note: This kata is in flux, I've added the link to the C++ operators but the examples use ^ as "to the power of" rather than bitwise negate. Follow the examples until I update the list of tests.
Examples
Here is a series of examples with expected results:One Approach
The examples above are taken from a recent (June 2009) personal walk-through of the algorithm. I was surprised by my results and the ease with which I was able to implement the algorithm.I recommend the following approach when trying this problem:
One note, when I worked on this, I used Java with JUnit 4's parameterized tests. Here's what the final test file looks like:
This worked OK, but occasionally I wanted to run just one test. This does not seem to be supported, so I had to comment out various lines in the data() method. The next time I work on this, I'll probably use individual test methods.
Getting Started
Here is one possible start to this problem (with the first three tests added and passing):The Test
Note: Occasionally I experiment with different ways of writing unit tests with JUnit. This is one such example where I attempt to simulate the domain specific language used by cucumber.
The Production Code
And here's the results after the first three tests:
Hints
I take the approach of splitting the expression into individual tokens and then processing them one by one. To do this, I use regular expressions and to format the output, I use String.format(...). Here are a few hints along those lines:What We Accomplished
Here's a zip of the source files we created during the dojo:Here's a version I've worked on a bit more:
<--Back